arXiv:1505.00442v1 [astro-ph.CO] 3 May 2015


Astronomy & Astrophysics manuscript no. ms © ESO 2015 

May 5, 2015 


The VIMOS Public Extragalactic Redshift Survey (VIPERS) * 

On the correct recovery of the count-in-cell probability distribution function. 

J. Bel 2, E. Branchini 10,28,29, C. Di Porto 9, O. Cucciati 17,9, B. R. Granett 2, A. Iovino 2, S. de la Torre 4, C. Marinoni 7,30,31,
L. Guzzo 2,27, L. Moscardini 17,18,9, A. Cappi 9,21, U. Abbas 5, C. Adami 4, S. Arnouts 6, M. Bolzonella 9, D. Bottini 3,
J. Coupon 32, I. Davidzon 9,17, G. De Lucia 13, A. Fritz 3, P. Franzetti 3, M. Fumana 3, B. Garilli 3,4, O. Ilbert 4, J. Krywult 15,
V. Le Brun 4, O. Le Fèvre 4, D. Maccagni 3, K. Małek 23, F. Marulli 17,18,9, H. J. McCracken 19, L. Paioro 3, M. Polletta 3,
A. Pollo 22,23, H. Schlagenhaufer 24,20, M. Scodeggio 3, L. A. M. Tasca 4, R. Tojeiro 11, D. Vergani 25,9, A. Zanichelli 26,
A. Burden 11, A. Marchetti 1,2, Y. Mellier 19, R. C. Nichol 11, J. A. Peacock 14, W. J. Percival 11, S. Phleps 20, and M. Wolk 19

(Affiliations can be found after the references) 

Received accepted - 


ABSTRACT 

We compare three methods to measure the count-in-cell probability density function of galaxies in a spectroscopic redshift survey. From this
comparison we find that when the sampling is low (the average number of objects per cell is around unity) it is necessary to use a parametric
method to model the galaxy distribution. We use a set of mock catalogues of VIPERS to verify whether we are able to reconstruct the
cell-count probability distribution once the observational strategy is applied. We find that, in the simulated catalogues, the probability distribution
of galaxies is better represented by a Gamma expansion than by a Skewed Log-Normal. Finally, we correct the cell-count probability distribution
function for the angular selection effect of the VIMOS instrument and study the redshift and absolute magnitude dependence of the underlying
galaxy density function in VIPERS from redshift 0.5 to 1.1. We find very weak evolution of the probability density distribution function and that
it is well approximated, independently of the chosen tracers, by a Gamma distribution.

Key words. Cosmology: cosmological parameters - cosmology: large scale structure of the Universe - Galaxies: high-redshift - Galaxies: statistics


1. Introduction 

Galaxy clustering offers a formidable playground to try to
understand how structures have been growing during the evolution
of the universe. A number of statistical tools have been
developed and used over the past thirty years (see Bernardeau et al.
2002 for a review). In general, these statistical methods use the
fact that the clustering of galaxies is due to the gravitational pull
of the underlying matter distribution. Hence, the study of the
spatial distribution of galaxies in the universe allows us to get
information about the statistical properties of its matter content.
As a result, it is of paramount importance to be able to measure
the statistical quantities describing the galaxy distribution from
a redshift survey.

The development of multi-object spectrographs on 8-m class
telescopes during the 1990s triggered a number of deep redshift
surveys with measured distances beyond $z \sim 0.5$ over areas of
1-2 deg$^2$ (e.g. VVDS, Le Fèvre et al. 2005; DEEP2, Newman et al.
2012; and zCOSMOS, Lilly et al. 2009). Even so, it was not until
the wide extension of VVDS was produced (Garilli et al. 2008)
that a survey existed with sufficient volume to attempt
cosmologically meaningful computations at $z \sim 1$ (Guzzo et al. 2008).
In general, clustering measurements at $z \simeq 1$ from these samples
remained dominated by cosmic variance, as dramatically shown
by the discrepancy observed between the VVDS and zCOSMOS
correlation functions at $z \simeq 0.8$ (de la Torre et al. 2010).

The VIMOS Public Extragalactic Redshift Survey (VIPERS)
is part of this global attempt to take cosmological measurements
at $z \sim 1$ to a new level in terms of statistical significance. In
contrast to the BOSS and WiggleZ surveys, which use
large-field-of-view ($\sim 1$ deg$^2$) fibre optic positioners to probe huge
volumes at low sampling density, VIPERS exploits the features
of VIMOS at the ESO VLT to yield a dense galaxy sampling
over a moderately large field of view ($\sim 0.08$ deg$^2$). It reaches
a volume at $0.5 < z < 1.2$ comparable to that of the 2dFGRS
(Colless et al. 2001) at $z \sim 0.1$, allowing the cosmological
evolution to be tested with small statistical errors.

The VIPERS redshifts are being collected by tiling the
selected sky areas with a uniform mosaic of VIMOS fields.
The area covered is not contiguous, but presents regular gaps
due to the specific footprint of the instrument field of view,
in addition to intrinsic unobserved areas due to bright stars
or defects in the original photometric catalogue. The VIMOS
field of view has four rectangular regions of about $8 \times 7$
square arcminutes each, separated by an unobserved cross
(Guzzo et al. 2014; de la Torre et al. 2013). This creates a
regular pattern of gaps in the angular distribution of the
measured galaxies. Additionally, the Target Sampling Rate and the
Survey Success Rate vary among the quadrants, and a few
of the latter were lost because of mechanical problems within
VIMOS (Garilli et al. 2014). Finally, the slit-positioning
algorithm, SPOC (see Bottini et al. 2005), also introduces some
small-scale angular selection effects, with different constraints
along the dispersion and spatial directions of the spectra, as
thoroughly discussed in de la Torre et al. (2013). Clearly, this
combination of angular selection effects has to be taken properly into
account when estimating any clustering statistics.

In this paper we measure the probability distribution function
from the VIPERS Public Data Release 1 (PDR-1) redshift
catalogue, including $\sim 64\%$ of the final number of redshifts
expected at completion (see Guzzo et al. 2014; Garilli et al. 2014
for a detailed description of the survey data set). The paper is
organized as follows. In §2 we introduce the VIPERS survey and
the features of the PDR-1 sample. In §3 we review the basics of
the three methods we compare. In §4 we present a null test of
the three methods on a synthetic galaxy catalogue. In §5 we use
galaxy mock catalogues to assess the performance of two of the
methods. The magnitude and redshift dependence of the probability
distribution function of VIPERS PDR-1 galaxies is presented
in §6 and conclusions are drawn in §7.

Throughout, the Hubble constant is parameterized via
$h = H_0/100$ km s$^{-1}$ Mpc$^{-1}$, all magnitudes in this paper are in the AB
system (Oke & Gunn 1983) and we will not give an explicit AB
suffix. In order to convert redshifts into comoving distances we
assume that the matter density parameter is $\Omega_m = 0.27$ and that
the universe is spatially flat with a $\Lambda$CDM cosmology without
radiation.
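
The redshift-to-distance conversion stated above can be illustrated with a short numerical sketch. It assumes only what the text gives (a flat $\Lambda$CDM universe with $\Omega_m = 0.27$ and no radiation); the function names, the use of Simpson's rule and the number of integration steps are illustrative choices, not the pipeline used for the analysis.

```python
import math

C_KM_S = 299792.458   # speed of light in km/s
OMEGA_M = 0.27        # matter density parameter quoted in the text


def hubble_ratio(z, omega_m=OMEGA_M):
    """E(z) = H(z)/H0 for a flat LCDM universe without radiation."""
    return math.sqrt(omega_m * (1.0 + z) ** 3 + (1.0 - omega_m))


def comoving_distance(z, omega_m=OMEGA_M, n_steps=1000):
    """Line-of-sight comoving distance in h^-1 Mpc (composite Simpson rule)."""
    if z == 0.0:
        return 0.0
    if n_steps % 2:              # Simpson's rule needs an even number of intervals
        n_steps += 1
    dz = z / n_steps
    total = 1.0 / hubble_ratio(0.0, omega_m) + 1.0 / hubble_ratio(z, omega_m)
    for i in range(1, n_steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight / hubble_ratio(i * dz, omega_m)
    # c / (100 km/s/Mpc) expresses the result in h^-1 Mpc
    return (C_KM_S / 100.0) * total * dz / 3.0


if __name__ == "__main__":
    for z in (0.5, 0.7, 0.9, 1.1):
        print(f"z = {z:.1f}: D_C = {comoving_distance(z):.1f} h^-1 Mpc")
```

Because the universe is assumed flat, the same distance also serves as the transverse comoving distance used to convert angular extents into lengths.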


2. Data 

The VIMOS Public Extragalactic Redshift Survey (VIPERS) is a
spectroscopic redshift survey being built using the VIMOS
spectrograph at the ESO VLT. The survey target sample is selected
from the Canada-France-Hawaii Telescope Legacy Survey Wide
(CFHTLS-Wide) optical photometric catalogues (Mellier et al.
2009). The final VIPERS will cover $\sim 24$ deg$^2$ on the sky,
divided over two areas within the W1 and W4 CFHTLS fields.
Galaxies are selected to a limit of $i_{AB} < 22.5$, further
applying a simple and robust gri colour pre-selection, so as to effectively
remove galaxies at $z < 0.5$. Coupled to an aggressive
observing strategy (Scodeggio et al. 2009), this allows us to double the
galaxy sampling rate in the redshift range of interest, with
respect to a pure magnitude-limited sample ($\sim 40\%$). At the same
time, the area and depth of the survey result in a fairly large
volume, $\sim 5 \times 10^7\,h^{-3}$ Mpc$^3$, analogous to that of the 2dFGRS
at $z \sim 0.1$. Such a combination of sampling and depth is quite
unique among current redshift surveys at $z > 0.5$. The VIPERS
spectra are collected with the VIMOS multi-object spectrograph
(Le Fèvre et al. 2003) at moderate resolution ($R = 210$), using
the LR Red grism, providing a wavelength coverage of
5500-9500 Å and a typical redshift error of $141(1+z)$ km s$^{-1}$. The
full VIPERS area is covered through a mosaic of 288 VIMOS
pointings (192 in the W1 area, and 96 in the W4 area). A
discussion of the survey data reduction and management
infrastructure is presented in Garilli et al. (2012). An early subset of the


Table 1. List of the magnitude selected objects (in B-band) in
the VIPERS PDR-1. The mean number density $\bar\rho$ (Eq. 1) is given
in units of $10^{-3}\,h^3$ Mpc$^{-3}$.

$z_{\min}$   $z_{\max}$   $M_B - 5\log(h) <$   $\bar\rho$
0.5          0.7          -18.6 - z            4.49
0.5          0.7          -19.1 - z            2.96
0.5          0.7          -19.5 - z            1.88
0.5          0.7          -19.7 - z            1.43
0.5          0.7          -19.9 - z            1.04
0.7          0.9          -19.1 - z            2.47
0.7          0.9          -19.5 - z            1.66
0.7          0.9          -19.7 - z            1.25
0.7          0.9          -19.9 - z            0.912
0.9          1.1          -19.5 - z            0.622
0.9          1.1          -19.7 - z            0.535
0.9          1.1          -19.9 - z            0.425


spectra used here is analyzed and classified through a Principal
Component Analysis (PCA) in Marchetti et al. (2012).

A quality flag is assigned to each measured redshift, based on
the quality of the corresponding spectrum. Here and in all
parallel VIPERS science analyses we use only galaxies with flags
2 to 9 inclusive, corresponding to a global redshift confidence
level of 98%. The redshift confirmation rate and redshift
accuracy have been estimated using repeated spectroscopic
observations in the VIPERS fields. A more complete description of the
survey construction, from the definition of the target sample to
the actual spectra and redshift measurements, is given in the
parallel survey description paper (Guzzo et al. 2014).

The data set used in this paper and the other papers of
this early science release is the VIPERS Public Data Release
1 (PDR-1) catalogue, which was made publicly available
in September 2013. This includes 55,359 objects, spread over a
global area of $8.6 \times 1.0$ deg$^2$ and $5.3 \times 1.5$ deg$^2$ respectively in
W1 and W4. It corresponds to the data frozen in the VIPERS
database at the end of the 2011/2012 observing campaign, i.e.
64% of the final expected survey. For the specific analysis
presented here, the sample has been further limited to its
higher-redshift part, selecting only galaxies with $0.55 < z < 1.1$. The
reason for this selection is related to minimizing the shot noise
and maximizing the volume. This reduces the usable sample to
18135 and 16879 galaxies in W1 and W4 respectively (always
with quality flags between 2 and 9). The corresponding effective
volumes of the two samples are 6.57 and $6.14 \times 10^6\,h^{-3}$ Mpc$^3$.
At redshift $z = 1.1$ they span respectively the angular comoving
distances $\sim 370$ and $230\,h^{-1}$ Mpc. We divide the W1 and W4
fields in three redshift bins and we build magnitude limited
sub-samples in each of them. For convenience, we use the magnitude
limits listed in Table 1 of Di Porto et al. (2014), which we recall
in Table 1.

The VIMOS footprint has an important impact on the
observed probability of finding $N$ galaxies in a randomly placed
spherical cell in the survey volume. As a matter of fact, the effect
of the masked area can be directly appreciated on the first
moment of the probability distribution, i.e. the expectation value of
the number count $\bar N = \sum_{N=0}^{\infty} N P_N$. On the one hand, we can predict
the mean number of objects per cell from the knowledge of the
number density in each considered redshift bin; on the other
hand, we can estimate it by placing a regular grid of spherical
cells of radius $R$ into the volume surveyed by VIPERS.

Fig. 1. Upper. Expected mean number count in spheres (solid line, from Eq. 2) with respect to the observed one (symbols) for the various
luminosity cuts and for the three redshift bins [0.5,0.7] (left panel), [0.7,0.9] (central panel), and [0.9,1.1] (right panel). The selection in absolute
magnitude $M_B$ in B-band corresponding to each symbol/line and colour is indicated in the inset. The dotted line displays $\bar N = 1$. Lower.
Deviation $\alpha$ (see Eq. 3) between the expected mean number $\bar N_R$ and the observed one $\bar N$ with respect to the radius $R$ of the cells.

In fact, given the solid angle of W1 and W4 and the corresponding
number of galaxies $N_1$ and $N_4$ contained in a redshift bin extracted
from each field, one can estimate the total number density as

$$\bar\rho = \frac{N_1 + N_4}{\Omega_1 + \Omega_4}\,\frac{1}{V_k}, \qquad (1)$$

where $V_k$ is defined as the volume corresponding to a sector
of a spherical shell with solid angle equal to unity. In the case
of VIPERS PDR-1 the effective solid angles corresponding
to W1 and W4 are respectively $\Omega_1 = 1.6651683 \times 10^{-3}$ and
$\Omega_4 = 1.5573021 \times 10^{-3}$ (in square radians). One can therefore
predict the corresponding expected number of objects in each
cell by multiplying the averaged number density by the volume
of a cell. It reads

$$\bar N_R = \bar\rho\,\frac{4}{3}\pi R^3 \qquad (2)$$

in the case of the spherical cells of radius $R$ considered in this
work. The expectation value $\bar N_R$ with respect to the radius of
the cells corresponding to each luminosity sub-sample extracted
from VIPERS PDR-1 is represented by lines in Fig. 1. On the
same figure we display the measured mean number of objects $\bar N$
in each redshift bin. Note that to perform this measurement we
place a grid of equally separated ($4\,h^{-1}$ Mpc) spheres of radius
$R = 4, 6, 8\,h^{-1}$ Mpc and we reject spheres with more than 40% of
their volume outside the observed region (see Bel et al. 2014).
We quantify the effect of the mask using the quantity

$$\alpha \equiv \frac{\bar N}{\bar N_R}; \qquad (3)$$


in fact, the bottom panels of Fig. 1 show that for all sub-samples
and at all redshifts the net effect of the masks is to
under-sample the galaxy field by roughly 72%. They also show that the
correction factor $\alpha$ depends on the considered redshift, on the
luminosity and on the cell size. The scale dependency can be
explained by the fact that the correction parameter $\alpha$ depends on
how the cells overlap with the masked regions. The left panel
of Fig. 1 suggests that at low redshift the mask effect behaves
in the same way for all the luminosity samples, while the
middle panel shows a clear dependency with respect to luminosity.
The correction factor $\alpha$ depends on the redshift distribution; as a
result, the apparent dependency with respect to the luminosity is
due to the dependence of the number density on the
luminosity of the considered objects.
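
As a worked illustration of the prediction of the mean count and of the mask factor, the sketch below multiplies a number density from Table 1 by the volume of a spherical cell and forms the ratio of an observed mean count to the expected one. The function names are hypothetical, and the "observed" mean used in the example is an invented placeholder, not a VIPERS measurement.

```python
import math


def expected_mean_count(number_density, radius):
    """Expected galaxies per sphere: density [h^3 Mpc^-3] times volume [h^-3 Mpc^3]."""
    return number_density * (4.0 / 3.0) * math.pi * radius ** 3


def mask_factor(observed_mean, expected_mean):
    """alpha: ratio of the observed (masked) mean count to the expected one."""
    return observed_mean / expected_mean


def unmasked_mean(observed_mean, alpha):
    """Undo the mask on the first moment: N = N(masked) / alpha."""
    return observed_mean / alpha


rho = 4.49e-3  # first row of Table 1, in h^3 Mpc^-3
for radius in (4.0, 6.0, 8.0):
    print(f"R = {radius:.0f} h^-1 Mpc: expected mean count = "
          f"{expected_mean_count(rho, radius):.3f}")

observed = 0.87  # hypothetical observed mean count at R = 4 h^-1 Mpc
alpha = mask_factor(observed, expected_mean_count(rho, 4.0))
print(f"alpha = {alpha:.3f}, corrected mean = {unmasked_mean(observed, alpha):.3f}")
```

With the brightest-limit density of the lowest redshift bin, the expected mean count at $R = 4\,h^{-1}$ Mpc is close to unity, which is the low-sampling regime discussed in the rest of the paper.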

The mask not only modifies the mean number of objects but
it also modifies the higher order moments of the distribution,
such that the measured $P_N$ will be systematically altered. In the
present paper we show that this systematic effect can be taken
into account by measuring the underlying probability density
function of the galaxy density contrast $\delta$. It has been shown (see
Fig. 8 of Bel et al. 2014) that after rejecting spheres with more
than 40% of their volume outside the survey, the local Poisson
process approximation holds. In particular, it allows us to use the
“wrong” probability distribution function in order to get reliable
information on the underlying probability density function $p(\delta)$.
Then, applying the Poisson sampling, one can recover the
unaltered $P_N$ using $\bar N = \bar N_{\rm masked}/\alpha$. For the sake of
completeness we provide the reader with the measured probability
function obtained after rejecting the cells with more than 40%
of their volume outside the survey (see Fig. El).

In particular, let $P_M$ and $P_N$ be, respectively, the observed
and the true Counting Probability Distribution Function (CPDF).
Assuming that from the knowledge of $P_M$ there exists a process
to get the underlying probability density function of the
stochastic field $\Lambda$, which is associated to the random variable $N$, one can
compute the true CPDF applying

$$P_N = \int_0^{\infty} P[N|\Lambda]\,p(\Lambda)\,d\Lambda, \qquad (4)$$

where $P[N|\Lambda]$ is called the sampling conditional probability; it
determines the sampling process from which the discrete
cell-count arises. In the following we assume that this sampling
conditional probability follows a Poisson law (Layzer 1956); as a
result, in Eq. (4) we substitute

$$P[N|\Lambda] = K[N,\Lambda] \equiv \frac{\Lambda^N}{N!}\,e^{-\Lambda}. \qquad (5)$$

It is also convenient to express Eq. (4) in terms of the density
contrast of the stochastic field $\Lambda$, $\delta \equiv \Lambda/\bar\Lambda - 1$; it follows that

$$P_N = \int_{-1}^{\infty} K[N, \bar N(1+\delta)]\,p(\delta)\,d\delta, \qquad (6)$$

where we used that $\bar\Lambda = \bar N$, which is a property of the Poisson
sampling.
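
The Poisson-sampling integral that maps a continuous density PDF onto discrete cell counts can be checked numerically. The sketch below is a minimal illustration, assuming a Gamma-distributed field $\Lambda$ with arbitrary shape and scale (the values `K_SHAPE` and `THETA` are invented for the example) and a plain midpoint rule; none of the names come from the analysis code.

```python
import math


def poisson_kernel(n, lam):
    """K[N, Lambda] = Lambda^N exp(-Lambda) / N!, evaluated in log space."""
    return math.exp(n * math.log(lam) - lam - math.lgamma(n + 1))


def gamma_pdf(lam, k, theta):
    """Gamma density with shape k and scale theta (illustrative choice of p)."""
    return math.exp((k - 1.0) * math.log(lam) - lam / theta
                    - k * math.log(theta) - math.lgamma(k))


def cpdf_from_pdf(pdf, lam_max, n_max, n_bins=3000):
    """P_N = int K[N, Lambda] p(Lambda) dLambda, midpoint rule on (0, lam_max)."""
    d_lam = lam_max / n_bins
    midpoints = [(j + 0.5) * d_lam for j in range(n_bins)]
    masses = [pdf(lam) * d_lam for lam in midpoints]
    return [sum(poisson_kernel(n, lam) * w for lam, w in zip(midpoints, masses))
            for n in range(n_max + 1)]


K_SHAPE, THETA = 2.0, 0.4          # mean of Lambda = K_SHAPE * THETA = 0.8
p_n = cpdf_from_pdf(lambda lam: gamma_pdf(lam, K_SHAPE, THETA),
                    lam_max=30.0, n_max=40)
mean_count = sum(n * p for n, p in enumerate(p_n))
print(f"sum P_N = {sum(p_n):.5f}, mean count = {mean_count:.5f}")
```

The mean of the discrete counts reproduces the mean of $\Lambda$, which is the property $\bar\Lambda = \bar N$ used in the text.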

Continuing in this direction, we propose to compare
three methods which aim at extracting the underlying probability
density function (PDF) in order to correct the observed CPDF
for the angular selection effects of VIPERS.

3. Methods 

In this section we review the PDF estimators that we use and
compare with each other in this paper. The purpose is to select
the method which is best adapted to the VIPERS
characteristics.

3.1. The Richardson-Lucy deconvolution

This is an iterative method which aims at inverting Eq. (6)
without parametrizing the underlying PDF; it has been investigated
by Szapudi & Pan (2004). This method starts with an initial
guess $p_0$ for the probability density function $p$, which is used
to compute the corresponding expected observed $P_{N,0}$ via

$$P_{N,0} = \int_{-1}^{\infty} \tilde K[N, \bar N(1+\delta)]\,\tilde p_0(\delta)\,d\delta,$$

where $\tilde K[N, \bar N(1+\delta)] \equiv K/\sum_N K$. The probability density
function used at the next step is obtained using

$$\tilde p_{i+1}(\delta) = \tilde p_i(\delta) \sum_{N \geq 0} \frac{P_N}{P_{N,i}}\,\tilde K[N, \bar N(1+\delta)],$$

where $\tilde p \equiv p \sum_N K$. For each step the agreement between the
expected observed probability distribution $P_{N,i}$ and the true one
$P_N$ is quantified by

$$\chi_i^2 = \sum_{N \geq 0} \left(P_{N,i} - P_N\right)^2.$$

It is therefore possible to know the evolution of the cost function
$\chi^2$ with respect to the steps $i$.

In fact it has been shown by Szapudi & Pan (2004) that it
converges toward a constant value which corresponds to the
best evaluation of the probability density function $p$ given the
observed probability distribution $P_N$. These authors have
shown that this convergence occurs after around 30 iterations,
and our own convergence tests confirm that adopting
a value of 30 iterations is enough. However, it happens that
the evolution of the $\chi^2$ is not always monotonic. In practice, we
store the $\chi^2$ result of each step and we look for the step for which
the $\chi^2$ is minimum, i.e. $p(\delta) \simeq p_{i_{\min}}(\delta)$. As an initial guess we set
the continuous PDF equal to the discrete CPDF ($p_0(\delta) = P_N$).

3.2. The Skewed Log-Normal

This is a parametric method where the shape of the probability
density depends on a given number of parameters; in this case the
probability density function is assumed to be well described by
a Skewed Log-Normal (Colombi 1994) distribution. It is derived
from the Log-Normal distribution (Coles & Jones 1991) but it is
more flexible. It is indeed built upon an Edgeworth expansion: if
the stochastic field $\Phi \equiv \ln(1+\delta)$ follows a Normal distribution,
then the density contrast $\delta$ follows instead a Log-Normal
distribution. In the case of the Skewed Log-Normal (SLN) density
function, the field $\Phi$ follows an Edgeworth expanded Normal
distribution

$$P_\Phi(\Phi) = \left[1 + \frac{\langle\nu^3\rangle_c}{6} H_3(\nu) + \frac{\langle\nu^4\rangle_c}{24} H_4(\nu) + \frac{\langle\nu^3\rangle_c^2}{72} H_6(\nu)\right] \frac{G(\nu)}{\sigma_\Phi}, \qquad (7)$$

where $\nu \equiv (\Phi - \mu_\Phi)/\sigma_\Phi$, $H_n$ denotes the Hermite polynomial of
order $n$, $G$ is the central reduced Normal distribution
$G(\nu) \equiv e^{-\nu^2/2}/\sqrt{2\pi}$ and $\langle\nu^n\rangle_c$ denotes the cumulant expectation value
of $\nu$. As a result, the SLN is parameterized by the four
parameters $\mu_\Phi$, $\sigma_\Phi$, $\langle\nu^3\rangle_c$, and $\langle\nu^4\rangle_c$, which are related, respectively, to
the mean, the dispersion, the skewness and the kurtosis of the
stochastic variable $\Phi$. They can all be expressed in terms of
cumulants $\langle\Phi^n\rangle_c$ of order $n$ of the weakly non-Gaussian field $\Phi$.
Szapudi & Pan (2004) use a best-fit approach and
determine these parameters by minimizing the difference between the
measured counting probability $P_N$ and the one obtained from

$$P_N^{\rm th} = \int_{-\infty}^{+\infty} K[N, \bar N(1+\delta)]\,P_\Phi[\ln(1+\delta); \mu_\Phi, \sigma_\Phi^2, \langle\Phi^3\rangle_c, \langle\Phi^4\rangle_c]\,d\ln(1+\delta). \qquad (8)$$

However, this requires us to perform the integral (Eq. 8) in a
four-dimensional parameter space, which is numerically expensive.

In the present paper we use an alternative implementation
which is computationally more efficient. Instead of trying to
maximize the likelihood of the model given the observations,
we rather use the observations to predict the parameters of the
SLN. To do so we use the property of the local Poisson
sampling (Bel & Marinoni 2012): the factorial moments
$\langle (N)_f^n \rangle \equiv \langle N(N-1)\cdots(N-n+1)\rangle$ of
the discrete counts are equal to the moments of the underlying
continuous distribution $\langle\Lambda^n\rangle$. Since the transformation between
the density contrast $\delta$ and the Edgeworth expanded field $\Phi$ is
local and deterministic, it is possible to find a relation between the
moments $\langle\Lambda^n\rangle$ and the cumulants $\langle\Phi^n\rangle_c$.

By definition, the moments of the positive continuous field
$\Lambda$ are given by

$$\langle\Lambda^n\rangle = \int_0^{\infty} \Lambda^n P(\Lambda)\,d\Lambda,$$

then for a local deterministic transformation the conservation of
probability imposes $P(\Lambda)\,d\Lambda = P_\Phi(\Phi)\,d\Phi$; it follows that the
moments of $\Lambda$ can be recast in terms of $\Phi$

$$\langle\Lambda^n\rangle = \bar\Lambda^n \int_{-\infty}^{+\infty} e^{n\Phi}\,P_\Phi(\Phi)\,d\Phi.$$

In the right hand side one can recognize the definition of the
moment generating function $M_\Phi(t) \equiv \langle e^{t\Phi}\rangle$; we therefore obtain
that

$$M_\Phi(t = n) = \frac{\langle\Lambda^n\rangle}{\bar\Lambda^n} \equiv A_n. \qquad (9)$$

This equation allows us to link the moments of $\Lambda$ to the cumulants
of $\Phi$ via the moment generating function $M_\Phi$.

Moreover, since the probability density $P_\Phi$ is the product of
a sum of Hermite polynomials with a Gaussian function, it is
straightforward to compute the explicit expression of the
moment generating function; we obtain

$$M_\Phi(t) = \left[1 + \frac{\langle\Phi^3\rangle_c}{6} t^3 + \frac{\langle\Phi^4\rangle_c}{24} t^4 + \frac{\langle\Phi^3\rangle_c^2}{72} t^6\right] e^{\mu_\Phi t + \sigma_\Phi^2 t^2/2}. \qquad (10)$$

As a matter of fact, Eq. (10) and Eq. (9) together allow us to set up
a system of four equations; for $n = 1, 2, 3, 4$ it reads

$$Y^n X^{n^2} B_n = A_n, \qquad (11)$$

where $Y \equiv e^{\mu_\Phi}$, $X \equiv e^{\sigma_\Phi^2/2}$ and $B_n \equiv M_\Phi(t = n; \mu_\Phi = 0, \sigma_\Phi = 0)$.

In the system of equations (Eq. 11) the right hand side is given
by observations and the left hand side depends on the cumulants
$\mu_\Phi$, $\sigma_\Phi^2$, $\langle\Phi^3\rangle_c$ and $\langle\Phi^4\rangle_c$, parameterized in terms of $X$, $Y$,
$x \equiv \langle\Phi^3\rangle_c$ and $y \equiv \langle\Phi^4\rangle_c$. In Appendix A we detail the procedure to solve
this non-linear system of equations.

We therefore get the values of the four parameters of the
SLN by simply measuring the moments of the counting variable
$N$ up to the fourth order.

3.3. The Gamma expansion

The Gamma expansion method follows the same idea as
described in §3.2 but it uses a Gamma distribution instead of
a Gaussian one. It uses the orthogonality properties of the
Laguerre polynomials in order to modify the moments of the
Gamma PDF. Such an expansion has been investigated in
Gaztañaga, Fosalba & Elizalde (2000), where they compared it
to the Edgeworth expansion in order to model the one-point
PDF of the matter density field. Then it has been further
extended, in a more general context, to multi-point distributions
by Mustapha & Dimitrakopoulos (2010).

As mentioned above the Gamma expansion requires the use
of the Gamma distribution $\phi_G$ defined as

$$\phi_G(u) = \frac{u^{k-1} e^{-u}}{\theta\,\Gamma(k)}, \qquad (12)$$

where $\Gamma$ is the Gamma function (for an integer $n$, $\Gamma(n+1) = n!$);
$\theta$ and $k$ are two parameters which are related to the two first
moments of the PDF. If the galaxy probability density function
is well described by a Gamma expansion at order $n$ then it can
be formally written as

$$P(\Lambda) = \phi_G(u)\,f_n^{(k-1)}(u), \qquad (13)$$

where by definition $u \equiv \Lambda/\theta$, $k \equiv \bar\Lambda^2/\sigma_\Lambda^2$ and
$\theta \equiv \sigma_\Lambda^2/\bar\Lambda = \bar\Lambda/k$. The function
$f_n^{(k-1)}$ represents the expansion aiming at tuning the moments of
the Gamma distribution; note that the exponent $(k-1)$ is not the
derivative of order $k-1$. Since this expansion is built upon the
orthogonal properties of products of Laguerre polynomials with
the Gamma distribution, the function $f_n^{(k-1)}$ is given by the sum

$$f_n^{(k-1)}(x) = \sum_{i=0}^{n} c_i\,L_i^{(k-1)}(x), \qquad (14)$$

where $L_i^{(k-1)}$ are the generalized Laguerre polynomials of order
$i$ and the coefficients $c_i$ represent the coefficients of the Gamma
expansion and therefore depend on the moments of the galaxy
field $\Lambda$

$$c_i = \sum_{j=0}^{i} (-1)^j \binom{i}{j} \frac{\Gamma(k)}{\Gamma(k+j)} \frac{\langle\Lambda^j\rangle}{\theta^j}. \qquad (15)$$

The main interest of the Gamma expansion with respect to the
SLN is that the coefficients of the expansion are directly related
to the moments of the distribution we want to model, i.e. it is
not necessary to solve a complicated non-linear system of
equations nor to perform a likelihood estimation of the coefficients.
Moreover, it can easily be carried to higher order to describe
as well as possible the underlying probability density function of
galaxies.

Another advantage of describing the galaxy field $\Lambda$ by a
Gamma expansion probability density function is that the
corresponding observed $P_N$ can be expressed analytically, which is
not the case for the SLN, which must be integrated numerically.

In Appendix B we demonstrate the previous statement; it
follows that the CPDF $P_N$ can be calculated from

$$P_N = \frac{(-\theta)^N}{N!} \sum_{i=0}^{n} c_i\,\frac{\Gamma(k+i)}{\Gamma(k)}\,h_i^{(N)}(\theta), \qquad (16)$$

where $h_i(\theta) \equiv \frac{1}{i!}\frac{\theta^i}{(1+\theta)^{i+k}}$ and in this case we use the notation
$h_i^{(N)} \equiv d^N h_i/d\theta^N$. The successive derivatives of $h_i$ can be obtained from the
recursive relation

$$h_i^{(N)}(\theta) = \frac{\theta^{i-N}}{(i-N)!\,(1+\theta)^{i+k}} - \sum_{m=1}^{N} \binom{N}{m} \frac{(i+k)^{\underline{m}}}{(1+\theta)^m}\,h_i^{(N-m)}(\theta),$$

where $(i+k)^{\underline{m}} \equiv (i+k)(i+k-1)\cdots(i+k-m+1)$ and the first term
vanishes for $N > i$.

In addition to the fact that having the possibility of computing
the corresponding observed $P_N$ without requiring an infinite
integral for each number $N$ is computationally more efficient, it is
also practical to have the analytical calculation for some
peculiar values of the $k$ parameter of the distribution. In fact, when
$k$ is lower than 1, which occurs on small scales ($4\,h^{-1}$ Mpc), the
probability density function goes to infinity when $\Lambda$ goes to 0
(although the distribution is still well defined). In particular, this
numerical divergence would induce large numerical
uncertainties in the computation of the void probability $P_0$. In addition,
one can see that for the void probability we have the simple
relation

$$P_0 = \sum_{i=0}^{n} c_i\,\frac{\Gamma(k+i)}{\Gamma(k)}\,h_i(\theta), \qquad (17)$$

which can be used to recover the true void probability in
VIPERS.
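
The Gamma expansion can be sketched in a few lines. The code below assumes that the coefficients take the form that follows from the orthogonality of the generalized Laguerre polynomials, $c_i = \sum_j (-1)^j \binom{i}{j}\,\Gamma(k)\,\langle\Lambda^j\rangle / [\Gamma(k+j)\,\theta^j]$, and that the void probability is $P_0 = \sum_i c_i\,[\Gamma(k+i)/\Gamma(k)]\,h_i(\theta)$ with $h_i(\theta) = \theta^i/[i!\,(1+\theta)^{i+k}]$. The function names and the test values are illustrative assumptions. It is checked on a pure Gamma field, for which all coefficients beyond $c_0$ must vanish and $P_0 = (1+\theta)^{-k}$.

```python
import math


def gamma_parameters(mean, variance):
    """Shape k and scale theta fixed by the first two moments of Lambda."""
    return mean ** 2 / variance, variance / mean


def gamma_ratio(k, i):
    """Gamma(k + i) / Gamma(k), computed through log-gamma for stability."""
    return math.exp(math.lgamma(k + i) - math.lgamma(k))


def expansion_coefficients(moments, k, theta):
    """Coefficients c_i from moments[j] = <Lambda^j>, j = 0..n (moments[0] = 1)."""
    coeffs = []
    for i in range(len(moments)):
        c_i = sum((-1) ** j * math.comb(i, j) * moments[j]
                  / (theta ** j * gamma_ratio(k, j)) for j in range(i + 1))
        coeffs.append(c_i)
    return coeffs


def void_probability(coeffs, k, theta):
    """P_0 as a finite sum over the expansion coefficients (no integral needed)."""
    return sum(c_i * gamma_ratio(k, i) * theta ** i
               / (math.factorial(i) * (1.0 + theta) ** (i + k))
               for i, c_i in enumerate(coeffs))


# Check on a pure Gamma field with k < 1, the regime where the PDF diverges at 0.
k, theta = gamma_parameters(mean=0.8, variance=1.0)
moments = [theta ** j * gamma_ratio(k, j) for j in range(5)]
coeffs = expansion_coefficients(moments, k, theta)
p_void = void_probability(coeffs, k, theta)
print("coefficients:", [round(c, 12) for c in coeffs])
print("void probability:", p_void)
```

The example uses $k < 1$ on purpose: the density diverges at $\Lambda = 0$, yet the void probability is obtained from a finite sum without any numerical integration.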


4. Application of the methods on a synthetic galaxy 
distribution 

In this section we analyse a suite of synthetic galaxy
distributions generated from 20 realizations of a Gaussian stochastic
field. The full process involved in generating these benchmark
catalogues is detailed in Appendix C. Each comoving volume
has a cubical geometry of size $500\,h^{-1}$ Mpc. We generate the
galaxies by discretizing the density field according to the
sampling conditional probability $P[N|\Lambda]$, which we assume to be a
Poisson distribution with mean $\Lambda$. In this way we know the true
underlying galaxy density contrast $\delta$. We can therefore perform
a fair comparison between the methods introduced in §3.

In order to avoid the effect of the grid ($0.95\,h^{-1}$ Mpc) we
smooth both the density field and the discrete field using a
spherical Top-Hat filter of radius $R = 8\,h^{-1}$ Mpc. We apply the three
methods mentioned in §3 and compare the reconstructed
probability density function to the expected one obtained directly from
the density field $\delta$.

The discrete distribution of points contains an average
number of objects per cell $\bar N \simeq 8$, which is the one expected according
to our sampling process. The corresponding $P_N$ is given by the
black histogram in the lower panel of Fig. 2; from this
measurement we apply the three methods R-L, SLN and $\Gamma_e$ and obtain an
estimation of the probability density function corresponding to
each method. In the upper panel of Fig. 2 we compare the
performance of the three methods in recovering the true probability
density function (black histogram referred to as reference in the
inset). Note that, for this test case, we use a Gamma expansion
at order 4 in order to be coherent with the order of the expansion
of the Skewed Log-Normal. We have also represented the
probability density function estimated when neglecting the shot noise
(red dotted line), which is used as the initial guess in the case of
the R-L method.

From the top panel of Fig. 2 we can conclude that the three
methods perform reasonably well. It seems that the $\Gamma_e$ method
better reproduces the density distribution of under-dense regions
($\delta \sim -1$), but this is expected in the sense that the distribution
used to generate the synthetic catalogues is a Gamma
distribution (see Appendix C). It is, however, not obvious, because
the scale on which the density field has been set up is one
order of magnitude smaller than the scale of the reconstruction,
$R = 8\,h^{-1}$ Mpc.

The performance of the three methods is also represented
in the bottom panel of Fig. 2, in which we compare the
expected observed $P_N$ from each method to the true one. One can
see that they all agree at the 15% level, hence it is not possible
to conclude that one is better than another. This was actually
expected from the comparison on the underlying density field
(Fig. 2). Conversely, if one of the methods did not agree
with the PDF then we would also expect a disagreement on the
observed CPDF (see Fig. 3).

In the following part we investigate the sensitivity of the
three methods with respect to the shot noise. In fact, as shown
in Fig. 1, in most of the sub-samples of VIPERS PDR-1 we
will work with a high shot noise level ($\bar N < 1$). We therefore
randomly under-sample the fake galaxy distribution by keeping
only 10% of the total number of objects contained in each
comoving volume. This process gives an average number per cell of 0.8,
which is more representative in the context of the application of
the reconstruction method. We perform the same comparison as
in the ideal case ($\bar N \simeq 8$) and find that the R-L method appears
to be highly sensitive to the shot noise. In fact, if the mean
number of objects per cell is too low then the output of the method
depends too much on the initial guess. It follows that, if it is too far
from the true PDF, the process does not converge (see top panel
of Fig. 3) and the corresponding $P_N$ does not match the observed
one (see bottom panel of Fig. 3). Note that we explicitly checked
this effect by increasing the number of iterations from 30 to 200.
In the case of both the SLN and the Gamma expansion, on the other hand,
one can see in Fig. 3 that the output probability density function is
in agreement (with a larger scatter) with the one obtained in the
$\bar N \simeq 8$ case. This means that the sensitivity to the shot
noise is much smaller when considering parametric methods.
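
To make the Richardson-Lucy iteration discussed in this section concrete, the sketch below runs it on a toy problem. It is not the implementation of Szapudi & Pan (2004) nor the one used for the figures: it works on a grid in $\Lambda$ rather than $\delta$, stores probability masses per bin, starts from a flat guess, and uses a squared-difference cost; the grid size, the Gamma-distributed "true" field and all names are assumptions made for illustration.

```python
import math


def poisson_kernel(n, lam):
    """Poisson probability of n counts for intensity lam, in log space."""
    return math.exp(n * math.log(lam) - lam - math.lgamma(n + 1))


def forward_model(masses, kernel):
    """Expected P_N for probability masses defined on the Lambda grid."""
    return [sum(k_nj * w for k_nj, w in zip(row, masses)) for row in kernel]


def chi2(model, observed):
    """Squared-difference cost between modelled and observed P_N."""
    return sum((m - o) ** 2 for m, o in zip(model, observed))


def richardson_lucy(observed, lam_grid, initial_masses, n_iter=30):
    """Iterate the R-L update; keep the step with the smallest cost."""
    n_values = range(len(observed))
    kernel = [[poisson_kernel(n, lam) for lam in lam_grid] for n in n_values]
    col_norm = [sum(kernel[n][j] for n in n_values) for j in range(len(lam_grid))]
    masses = list(initial_masses)
    history = [chi2(forward_model(masses, kernel), observed)]
    best_masses, best_cost = list(masses), history[0]
    for _ in range(n_iter):
        model = forward_model(masses, kernel)
        ratio = [o / m for o, m in zip(observed, model)]
        masses = [w * sum(r * kernel[n][j] for n, r in enumerate(ratio)) / col_norm[j]
                  for j, w in enumerate(masses)]
        cost = chi2(forward_model(masses, kernel), observed)
        history.append(cost)
        if cost < best_cost:   # the cost is not guaranteed to be monotonic
            best_masses, best_cost = list(masses), cost
    return best_masses, history


# Toy problem: Gamma-distributed Lambda with mean 0.8, observed through Poisson sampling.
N_BINS, LAM_MAX, N_MAX = 100, 10.0, 40
grid = [(j + 0.5) * LAM_MAX / N_BINS for j in range(N_BINS)]
raw = [lam * math.exp(-lam / 0.4) for lam in grid]      # shape 2, scale 0.4
true_masses = [r / sum(raw) for r in raw]
toy_kernel = [[poisson_kernel(n, lam) for lam in grid] for n in range(N_MAX + 1)]
observed_pn = forward_model(true_masses, toy_kernel)
flat_guess = [1.0 / N_BINS] * N_BINS
best, cost_history = richardson_lucy(observed_pn, grid, flat_guess)
print(f"initial cost = {cost_history[0]:.3e}, best cost = {min(cost_history):.3e}")
```

Storing the cost of every step and keeping the minimum mirrors the practice described in §3.1; with noisy counts and a low mean number per cell, the recovered masses remain strongly tied to the initial guess, which is the behaviour reported above.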


Fig. 2. ($\bar N = 8.0$) Upper. The black histogram with error bars shows the
true underlying probability density function (referred to as
reference in the inset) compared to the reconstruction obtained with
the R-L (red dashed line), the SLN (green dot-dashed line), and
the $\Gamma_e$ (blue long dashed line) methods. The red dotted histogram
shows the PDF used as the initial guess for the R-L method and
the coloured dotted lines around each method line represent the
dispersion of the reconstruction among the 20 fake galaxy
catalogues. We also display the relative difference of the result
obtained from each method with respect to the true PDF. Lower.
The black histogram with error bars shows the observed
probability density function (referred to as reference in the inset)
compared to the reconstruction obtained with the R-L (red dashed
line), the SLN (green dot-dashed line), and the $\Gamma_e$ (blue long
dashed line) methods. We also display the relative difference of
the result obtained from each method with respect to the
observed $P_N$.


Considering the sensitivity of the R-L method to the initial
guess, knowing that the average number of galaxies per cell can
be lower than unity, and finally taking into account computational
time, we shall continue our analysis using only the two
parametric methods, SLN and $\Gamma_e$. In the following, we will
compare them using more realistic mock catalogues, for which
we do not know a priori the true underlying PDF.
[Figure 3; panel label: $\bar{N} = 0.8$; horizontal axis: N]

Fig. 3. Same as in Fig. 2 but using only 10% of the galaxies
contained in the fake galaxy catalogues; as a result the average
number of galaxies per cell drops from $\bar{N} = 8$ to $\bar{N} = 0.8$.


5. Performances in realistic conditions 

In this section we discuss how observational effects have been
accounted for in our analysis and test the robustness of the
reconstruction methods SLN and Gamma expansion. For this purpose
we use a suite of mock catalogues created from the Millennium
simulation; they are also used in the analysis performed by
Di Porto et al. (2014).

We shall compare the reconstruction methods between
two catalogues, namely REFERENCE and MOCK. The REFERENCE
is a galaxy catalogue obtained from semi-analytical
models. We simulate the redshift errors of VIPERS PDR-1
by perturbing the redshift (including distortions due to
peculiar motions) with a Normally distributed error with rms
0.00047(1 + z). Each MOCK catalogue is built from the
corresponding REFERENCE catalogue by applying the same
observational strategy (de la Torre et al. 2013) which is applied
to VIPERS PDR-1; spectroscopic targets are selected from the
REFERENCE catalogue by applying the slit-positioning
algorithm (SPOC, Bottini et al. 2005) with the same setting as for
the PDR-1. This allows us to reproduce the VIPERS footprint
on the sky, the small-scale angular incompleteness due to
spectra collisions and the variation of target sampling rate across
the fields. Finally, we deplete each quadrant to reproduce the effect
of the survey success rate (SSR, see de la Torre et al. 2013). In


Table 2. List of the magnitude selected objects (in B-band) in
the mock catalogues

z_min   z_max   luminosity: M_B - 5 log(h) <
0.5     0.7     -18.42 - z
0.5     0.7     -19.12 - z
0.5     0.7     -19.72 - z
0.7     0.9     -19.12 - z
0.7     0.9     -19.72 - z
0.9     1.1     -19.72 - z



Fig. 4. Comparison between the SLN and $\Gamma_e$ methods at
0.9 < z < 1.1. Each panel corresponds to a cell radius R of 4, 6 and
8 $h^{-1}$Mpc from the left to the right. Top: The red histogram shows
the observed PDF in the MOCK catalogues while the black
histogram displays the PDF extracted from the REFERENCE
catalogues. The blue diamonds with lines and the magenta
triangles show, respectively, the $\Gamma_e$ expansion performed in
the REFERENCE and MOCK catalogues. On the other hand, the
cyan diamonds with lines and the orange triangles show,
respectively, the SLN expansion performed in the REFERENCE
and MOCK catalogues. Bottom: Relative deviation of the $\Gamma_e$ and
SLN expansions applied both to the REFERENCE and MOCK
catalogues with respect to the PDF of the REFERENCE catalogues.


this way, we end up with 50 realistic mock catalogues (named
MOCK hereafter), which simulate the detailed survey completeness
function and observational biases of VIPERS in the W1 and
W4 fields.
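As an illustration of the redshift-error step used when building the catalogues, the following sketch perturbs a redshift with a Gaussian error of rms 0.00047(1 + z), the value quoted in the text. The function name and the test redshift are assumptions made for the example; the slit positioning and success-rate depletion are not reproduced here.

```python
import random

SIGMA_Z = 0.00047  # rms redshift error per unit (1 + z), as quoted in the text

def perturb_redshift(z, rng):
    """Return z plus a Gaussian error of rms SIGMA_Z * (1 + z)."""
    return z + rng.gauss(0.0, SIGMA_Z * (1.0 + z))

rng = random.Random(1)  # fixed seed for reproducibility of the sketch
z_true = 0.8            # hypothetical input redshift
z_obs = [perturb_redshift(z_true, rng) for _ in range(20000)]

mean_obs = sum(z_obs) / len(z_obs)
rms = (sum((z - z_true) ** 2 for z in z_obs) / len(z_obs)) ** 0.5
print(f"mean = {mean_obs:.5f}, rms = {rms:.6f}")
```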

In order to perform an analysis similar to the one we aim at
doing for VIPERS PDR-1, we construct sub-samples of galaxies
selected according to their absolute magnitude $M_B$ in B-band;
we take all objects brighter than a given luminosity. We list those
samples in Tab. 2; we have in total 6 galaxy samples. The highest
luminosity cut ($M_B - 5\log(h) < -19.72 - z$) allows us to follow
a single population of galaxies at three cosmic epochs.
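The magnitude selection above, with a cut that brightens with redshift, can be sketched as follows. The tuple layout `(z, M_B - 5 log h)` and the function name are assumptions made for the example, and the toy galaxies are invented values.

```python
def select_sample(galaxies, z_min, z_max, mag_cut):
    """Keep galaxies with z_min < z < z_max and M_B - 5 log(h) < mag_cut - z.

    galaxies: iterable of (z, magnitude) tuples, magnitude = M_B - 5 log(h).
    """
    return [(z, mag) for z, mag in galaxies
            if z_min < z < z_max and mag < mag_cut - z]

# Hypothetical toy catalogue (not VIPERS data).
toy = [(0.55, -20.5), (0.65, -20.0), (0.75, -21.0), (0.95, -20.5)]

low_z_bright = select_sample(toy, 0.5, 0.7, -19.72)
mid_z_faint = select_sample(toy, 0.7, 0.9, -19.12)
high_z_bright = select_sample(toy, 0.9, 1.1, -19.72)
print(low_z_bright, mid_z_faint, high_z_bright)
```

Because the threshold is `mag_cut - z`, the same nominal cut selects intrinsically brighter objects at higher redshift, which is what allows a single population to be followed across the three bins.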

In Figs. 4, 5 and 6 we show the reconstruction performances
for the SLN and the $\Gamma_e$ methods. We consider the same
population ($M_B - 5\log(h) + z < -19.72$) but in three redshift bins,
0.9 < z < 1.1, 0.7 < z < 0.9 and 0.5 < z < 0.7. In order to
test the stability of the methods we perform the reconstruction
at three smoothing scales, R = 4, 6 and 8 $h^{-1}$Mpc. The comparison
is done as follows. On the one hand, we estimate the true $P_N$
from the REFERENCE catalogue (before applying the observational
selection) and we perform the reconstruction on it; in this
way we can test the intrinsic biases due to the assumed parametric
method (SLN or $\Gamma_e$). On the other hand, we estimate the
observed $P_N$ in the MOCK catalogues, from which we perform
the reconstruction to verify whether we recover the expected $P_N$
from the REFERENCE catalogue.

Inspecting Fig. 4 we can first see that the intrinsic error
due to the specific modeling of the methods is much larger for
the SLN (cyan diamonds compared to the black histogram) than
for the $\Gamma_e$ (blue diamonds compared to the black histogram).
From the top panels we see that the SLN does not reproduce the
tail of the CPDF and from the bottom panels we see that even
for low counts it shows deviations as large as 20%. This
intrinsic limitation propagates when performing the reconstruction
on the MOCK catalogue (orange triangles compared to
the black histogram), while for the $\Gamma_e$ we see that the
agreement is better than 10% (magenta triangles compared to the
black histogram) in the low count regime and the tail is fairly well
reproduced. Secondly, comparing the $\Gamma_e$ performed on the
REFERENCE and the MOCK catalogues (blue diamonds with
respect to magenta triangles), one can see that the loss of
information due to the observational strategy has an impact of at
most 10% on the reconstructed CPDF, which reduces when considering
larger cells (less shot noise).

In general, examination of Figs. 5 and 6 confirms that
for the considered galaxy population the same results hold at
lower redshifts. However, the reconstruction at R = 4 $h^{-1}$Mpc
in particular can exhibit deviations larger than 20%; this is at odds
with the fact that the shot noise contribution is expected to be the
same for the three redshift bins (magnitude limited). We attribute
this larger instability to the fact that not only is the shot noise
contribution higher for R = 4 $h^{-1}$Mpc, but the volume probed
is also smaller at lower redshift.

The performances of the reconstruction for the last three
galaxy samples are shown in Fig. 7, where each row corresponds
to a galaxy sample (we only show the residual with respect to the
REFERENCE). This last comparison allows us to say that the
reconstruction instability at 4 $h^{-1}$Mpc was indeed due to the high
level of shot noise. We can conclude that in the HOD galaxy
mock catalogues, the galaxy distribution is more likely to be
modelled by a $\Gamma_e$ than by an SLN. Finally, for a chosen
reconstruction method, the information contained in the MOCK
catalogues is enough to be able to reconstruct the CPDF of the
REFERENCE catalogue to better than 10%.


[Figure 5; panel label: $M_B - 5\log(h) < -19.72 - z$]

Fig. 5. Comparison between the SLN and $\Gamma_e$ methods at
0.7 < z < 0.9. Each panel corresponds to a cell radius R of 4, 6 and
8 $h^{-1}$Mpc from the left to the right.


Fig. 6. Comparison between the SLN and $\Gamma_e$ methods at
0.5 < z < 0.7. Each panel corresponds to a cell radius R of 4, 6 and
8 $h^{-1}$Mpc from the left to the right.


[Figure 7; column titles: R = 4 $h^{-1}$Mpc, R = 6 $h^{-1}$Mpc,
R = 8 $h^{-1}$Mpc; horizontal axis: N]

Fig. 7. Comparison between the SLN and $\Gamma_e$ methods. Each
column corresponds to a cell radius R of 4, 6 and 8 $h^{-1}$Mpc from
the left to the right, and each row corresponds to a combination of
redshift and magnitude cut.


6. VIPERS PDR-1 data 


In this section we apply the reconstruction method to the
VIPERS PDR-1. We saw in the previous sections that the SLN
and $\Gamma_e$ methods are sensitive to the assumption we make about
the underlying PDF. In fact, we saw in Sect. 4 that if the underlying
PDF is close to the chosen model then the reconstruction works.
We found in Sect. 5 that the galaxy distribution arising from
semi-analytic models is better described by a $\Gamma_e$ than by an
SLN distribution. However, in the following we will not take for
granted that the same property holds for the galaxies in the PDR-1.

We want to choose which one of the two distributions
(Log-Normal or Gamma) best describes the observed galaxy
distribution in VIPERS PDR-1 when no expansion is applied. Thus,
we compare the observed PDF to the one expected from the
Poisson sampling of the Log-Normal probability density function
(PS-LN) and to the one expected from the Poisson sampling
of the Gamma distribution (the so-called Negative Binomial).
Error bars are obtained by performing a jackknife resampling
of 3 x 7 subregions in each of the fields W1 and W4.
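A minimal sketch of a jackknife error estimate for the count-in-cell distribution is given below. It assumes, hypothetically, that the cell counts have already been grouped by subregion; the function names and the toy counts are illustrative and do not reproduce the actual 3 x 7 tiling of W1 and W4.

```python
import math

def pn_histogram(counts, n_max):
    """Fraction of cells containing N = 0..n_max objects."""
    hist = [0.0] * (n_max + 1)
    for c in counts:
        if c <= n_max:
            hist[c] += 1.0
    return [h / len(counts) for h in hist]

def jackknife_pn(counts_by_region, n_max):
    """Leave-one-region-out estimates of P_N and their jackknife errors."""
    n_reg = len(counts_by_region)
    loo = []
    for skip in range(n_reg):
        kept = [c for j, region in enumerate(counts_by_region)
                if j != skip for c in region]
        loo.append(pn_histogram(kept, n_max))
    mean = [sum(h[n] for h in loo) / n_reg for n in range(n_max + 1)]
    # Jackknife variance: (n_reg - 1) / n_reg times the sum of squared deviations.
    err = [math.sqrt((n_reg - 1) / n_reg * sum((h[n] - mean[n]) ** 2 for h in loo))
           for n in range(n_max + 1)]
    return mean, err

# Hypothetical cell counts in four subregions (toy values).
regions = [[0, 1, 1], [0, 0, 2], [1, 1, 1], [0, 3, 1]]
mean_pn, err_pn = jackknife_pn(regions, 3)
print(mean_pn, err_pn)
```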

The PS-LN distribution does not have an analytic expression
and must be obtained by numerically integrating Eq. ©,
while the Poisson sampling of the Gamma distribution leads to
the Negative Binomial distribution defined as

$$ P_N = \frac{\theta^N}{N!}\,\frac{r(r+1)\cdots(r+N-1)}{(1+\theta)^{N+r}}, \tag{18} $$

where $\theta = (\sigma_N^2 - \bar{N})/\bar{N}$ and $r = \bar{N}/\theta$,
with $\bar{N}$ and $\sigma_N^2$ the mean and the variance of the
observed counts, to ensure that the first two moments
of the Negative Binomial match those of the observed distribution.
We show in Fig. 8 the outcome of this comparison; it follows
that the Negative Binomial is much closer to the observed
PDF than the PS-LN. As a result, the underlying galaxy distribution
is more likely to be described by a Gamma distribution than
by a Log-Normal. Hence, we only use the Gamma expansion to
model the galaxy distribution of VIPERS PDR-1.
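The Negative Binomial model can be evaluated directly from the first two moments of the counts. The sketch below assumes that the two parameters are fixed by moment matching, i.e. mean = r θ and variance = r θ (1 + θ); the function name and the numerical moments are hypothetical and chosen only for illustration.

```python
import math

def negative_binomial_pmf(n, mean, variance):
    """Negative Binomial P_N with theta and r fixed by the mean and variance."""
    if variance <= mean:
        raise ValueError("the Negative Binomial requires variance > mean")
    theta = (variance - mean) / mean
    r = mean / theta
    # log of theta^N / N! * Gamma(N + r) / Gamma(r) / (1 + theta)^(N + r)
    log_p = (math.lgamma(n + r) - math.lgamma(r) - math.lgamma(n + 1)
             + n * math.log(theta) - (n + r) * math.log1p(theta))
    return math.exp(log_p)

# Hypothetical moments of the counts (not measured values).
mean_counts, var_counts = 0.8, 2.0
pmf = [negative_binomial_pmf(n, mean_counts, var_counts) for n in range(400)]

total = sum(pmf)
m1 = sum(n * p for n, p in enumerate(pmf))
m2 = sum(n * n * p for n, p in enumerate(pmf)) - m1 ** 2
print(total, m1, m2)
```

Working with logarithms of the Gamma function avoids overflow of the rising factorial r(r+1)...(r+N-1) at large N.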
Fig. 8. Observed count-in-cell probability distribution function $P_N$ (histograms) from VIPERS PDR-1 for various luminosity cuts
(indicated in the inset). Each row corresponds to a redshift bin, from the bottom to the top 0.5 < z < 0.7, 0.7 < z < 0.9,
and 0.9 < z < 1.1. Each column corresponds to a cell radius R = 4, 6, 8 $h^{-1}$Mpc from the left to the right. Moreover we added
the expected PDF from two models which match the first two moments of the observed distribution; the red solid line shows the
prediction for a Poisson sampled Log-Normal (PS-LN) CPDF while the green dashed line displays the Negative Binomial model
for the CPDF.





Moreover, the use of the Gamma expansion instead of the
SLN substantially simplifies the analysis. In Fig. 9 we provide
the reconstructed probability distribution function of VIPERS
PDR-1 together with the corresponding underlying probability
density function for each redshift bin and luminosity cut. Each
panel of Fig. 9 shows how the choice of a particular class of
tracers (selected according to their absolute magnitude in B-band)
influences the PDF of galaxies. When measuring specific
properties of the intrinsic galaxy distribution for each luminosity
cut, it is enough to look at the CPDF; however, when comparing
the distributions with each other it is necessary to take into
account the average number of objects per cell, which varies from
sample to sample. As a result it appears more useful to compare the
properties of the different galaxy samples using their underlying
probability density function which, assuming Poisson sampling,
is free from sampling rate variations between different types of
tracers.

For the first two redshift bins, we can see that the probability
density function broadens when selecting more luminous
galaxies; this goes in the direction of increasing the linear bias
with respect to the matter distribution. However, although the trend
is less significant, for the highest redshift bin it seems to go
in the opposite direction. This trend might be an artifact; indeed,
by analyzing Fig. 9 we see that for all these samples the average
number of objects per cell is between 0.2 and 0.4, which shows
that these samples could be strongly affected by shot noise.
As a result, specific care should be taken when interpreting those
three high redshift samples.

In the following we focus on the evolution of the underlying
PDF for a particular class of objects over the wide redshift range
probed by VIPERS PDR-1. Fig. 10 displays the outcome
of this study: it shows how the PDF, for three populations (the
three highest magnitude cuts), evolves with the redshift
at which it is measured. The three populations (top, middle and
bottom panels) exhibit non-monotonic evolution with respect to
redshift. In particular, for the most luminous population
the PDF at 0.9 < z < 1.1 appears to be systematically
different from that in the two lower redshift bins. However, we
also see that some instabilities appear in the reconstruction
(see the wiggles at high $1 + \delta$). This might be due to the fact
that we have fewer galaxies in this sample, giving rise to a large
shot-noise contribution ($\bar{N} < 0.3$). We indeed verified that,
for the high mass bin and the two other galaxy populations, when we
vary the order of the expansion from 6 to 4 the resulting PDF changes
by less than 1$\sigma$, while for the most luminous population
truncating the expansion at order 4 only removes the instability
without significantly changing the overall behavior of the PDF. This
consistency test suggests that the radical change in the measured
PDF for the highest redshift bin is a true feature.
Probably only the final VIPERS data set will be able to give a
robust conclusion.

Finally, in Tab. 3 we list the relevant coefficients of the
Gamma expansion which we measured from the VIPERS PDR-1
at the scale R = 6 $h^{-1}$Mpc. They can be used in order to model
both the CPDF (Eq. 16) and the PDF (Eq. 13).


7. Summary 

The main goal of the present paper is to measure the probability
of finding N galaxies falling into a spherical cell randomly
placed inside a sparsely sampled (i.e. with masked areas or with
low sampling rate) spectroscopic survey. Our general approach
to this problem has been to use the underlying probability
density distribution of the density contrast of galaxies in order to
[Figure 9; panel titles: 0.5 < z < 0.7, 0.7 < z < 0.9, 0.9 < z < 1.1; horizontal axis: N]

Fig. 9. Top: Reconstructed PDF applying the $\Gamma_e$ method in three redshift bins (from left to right) at the intermediate smoothing
scale R = 6 $h^{-1}$Mpc. Bottom: Underlying PDF corresponding to the CPDF in the top panel; for each luminosity cut the 1-sigma
uncertainty is represented by the dotted lines.


Table 3. Coefficients of the $\Gamma_e$ expansion which describe the VIPERS PDR-1 data for R = 6 $h^{-1}$Mpc

z         M_B - 5 log(h)   k            theta        c_3            c_4             c_5            c_6
0.5-0.7   -18.6 - z        0.87961819   4.5053822    -0.027583435   -0.030026522    -0.018218867   -0.019292756
          -19.1 - z        0.78883961   3.2677238    -0.011759548   -0.0041201299    0.0076149367  -0.0010233871
          -19.5 - z        0.72531432   2.2643581    -0.020667396    0.00070338969   0.021056193   -0.00061403852
          -19.7 - z        0.64267892   1.4068744    -0.034276861   -0.022797814     0.022229339    0.023963984
          -19.9 - z        0.64267892   1.4068744    -0.0071341640  -0.0072444524   -0.0030038079  -0.045733910
0.7-0.9   -19.1 - z        0.76911853   2.9737929    -0.063844766   -0.046627985    -0.032441385   -0.067589757
          -19.5 - z        0.73969794   2.0841542    -0.032831012   -0.032693436    -0.028383261   -0.064019117
          -19.7 - z        0.70270085   1.6638888    -0.019063352   -0.048572844    -0.061832661   -0.078445546
          -19.9 - z        0.67984433   1.2608492     0.013646925   -0.028325455    -0.042087256   -0.021113201
0.9-1.1   -19.5 - z        0.47473429   1.3138704    -0.10794135    -0.17074978     -0.10267837    -0.0089188521
          -19.7 - z        0.49470455   1.0926144    -0.075805086   -0.16739016     -0.13623398    -0.019540367
          -19.9 - z        0.48382041   0.90259279   -0.076620326   -0.20604275     -0.23060122    -0.14506575

recover the counting probability corrected for sparseness
effects. We therefore compared three ways (R-L, SLN and $\Gamma_e$) of
measuring the probability density of galaxies, classified in two
categories: direct and parametric. We found that when the
sampling is high ($\bar{N} \sim 10$) the direct method (Richardson-Lucy
deconvolution) performs well and avoids putting any prior on the
shape of the distribution. On the other hand, we saw that when
the sampling is low ($\bar{N} \sim 1$) the direct method fails to
converge to the true underlying distribution. We thus concluded that,
in such cases, the only alternative is to use a parametric method.

We presented two parametric forms aiming at describing the
galaxy density distribution: the SLN, which is often used in the
literature to model the matter distribution, and the $\Gamma_e$.
Although the two distributions used in this paper have already
been investigated in previous works, the approach we propose
to estimate their parameters is completely new. Previously,
fitting procedures were used in order to estimate them. Here we
propose to measure the parameters of the distributions directly
from the observations. The method can be applied to both
distributions, SLN and $\Gamma_e$, and considerably decreases the
computational time of the process.

Relying on simulated galaxy catalogues of VIPERS PDR-1,
we tested the reconstruction scheme of the counting probability
($P_N$) under realistic conditions in the case of the SLN and
$\Gamma_e$ expansions. We found that the reconstruction depends on
the choice of the model for the galaxy distribution. However, we
have also shown that it is possible to test which distribution
better describes the observations.

Using VIPERS PDR-1, on the relevant scales investigated in
this paper (R = 4, 6, 8 $h^{-1}$Mpc), we found that the Gamma
distribution gives a better description of the observed $P_N$ than
the one provided by the Log-Normal (see Fig. 8). We therefore adopted
the $\Gamma_e$ parametric form in order to reconstruct the probability
density functions of galaxies. From these reconstructions we studied
how their PDF changes according to their absolute luminosity in
B-band and we also studied their redshift evolution. We found that
little evolution has been detected in the first two redshift bins,
while it seems that the density distribution of the galaxy field is
strongly evolving in the last redshift bin.
[Figure 10; horizontal axis: $1 + \delta$]

Fig. 10. Evolution of three galaxy populations selected according
to their luminosity (from bottom to top). On each panel, the
black solid, red dashed and cyan dot-dashed lines represent,
respectively, the three redshift bins 0.5 < z < 0.7, 0.7 < z < 0.9
and 0.9 < z < 1.1.


Finally, we used the measured PDF in order to reconstruct the
counting probability (CPDF) one would observe if VIPERS were
not masked by gaps between the VIMOS quadrants.

Acknowledgements. JB acknowledges useful discussions with E. Gaztañaga.
We acknowledge the crucial contribution of the ESO staff for the
management of service observations. In particular, we are deeply grateful to
M. Hilker for his constant help and support of this programme. Italian
participation to VIPERS has been funded by INAF through PRIN 2008 and 2010
programmes. JB, LG and BJG acknowledge support of the European Research
Council through the Darklight ERC Advanced Research Grant (# 291521). OLF
acknowledges support of the European Research Council through the EARLY
ERC Advanced Research Grant (# 268107). AP, KM, and JK have been
supported by the National Science Centre (grants UMO-2012/07/B/ST9/04425
and UMO-2013/09/D/ST9/04030), the Polish-Swiss Astro Project (co-financed
by a grant from Switzerland, through the Swiss Contribution to the
enlarged European Union), the European Associated Laboratory Astrophysics
Poland-France HECOLS and a Japan Society for the Promotion of Science
(JSPS) Postdoctoral Fellowship for Foreign Researchers (PI 1802). GDL
acknowledges financial support from the European Research Council under the
European Community’s Seventh Framework Programme (FP7/2007-2013)/ERC
grant agreement n. 202781. WJP and RT acknowledge financial support from
the European Research Council under the European Community’s Seventh
Framework Programme (FP7/2007-2013)/ERC grant agreement n. 202686. WJP
is also grateful for support from the UK Science and Technology Facilities
Council through the grant ST/1001204/1. EB, FM and LM acknowledge the
support from grants ASI-INAF 1/023/12/0 and PRIN MIUR 2010-2011. CM is
grateful for support from specific project funding of the Institut Universitaire de
France and the LABEX OCEVU.


References 

Bel, J. & Marinoni, C. 2012, MNRAS, 424, 971
Bel, J., et al. (the VIPERS Team) 2014, A&A, 563, A37
Bernardeau, F., Colombi, S., Gaztañaga, E. & Scoccimarro, R. 2002, PR, 367, 1
Bottini, D., Garilli, B., Maccagni, D., et al. 2005, PASP, 117, 996
Coles, P. & Jones, B. 1991, MNRAS, 248, 1
Colless, M., Dalton, G., Maddox, S., et al. 2001, MNRAS, 328, 1039
Colombi, S. 1994, ApJ, 435, 536
de la Torre, S., Guzzo, L., Kovac, K., et al. (the ZCOSMOS collaboration) 2010, MNRAS, 409, 867
de la Torre, S., Guzzo, L., Peacock, J. A., et al. (VIPERS team) 2013, A&A, 557A, 54
di Porto, C., et al. (VIPERS team) 2014, submitted, arXiv:1406.6692
Eisenstein, D. J. & Hu, W. 1998, ApJ, 496, 605
Garilli, B., Le Fevre, O., Guzzo, L., et al. (the VVDS collaboration) 2008, A&A, 486, 683
Garilli, B., Paioro, L., Scodeggio, M., et al. 2012, PASP, 124, 1232
Garilli, B., Guzzo, L., Scodeggio, M., et al. (the VIPERS team) 2014, A&A, 562, 23
Gaztañaga, E., Fosalba, P. & Elizalde, E. 2000, ApJ, 539, 522
Greiner, M. & Enßlin, T. A. 2015, A&A, 574, 86
Guzzo, L., Pierleoni, M., Meneux, B., et al. (the VVDS team) 2008, Nature, 451, 541
Guzzo, L., Scodeggio, M., Garilli, B., et al. (the VIPERS team) 2014, A&A, 566, 108
Layzer, D. 1956, AJ, 61, 383
Le Fevre, O., Saisse, M., Mancini, D., et al. 2003, Proc. SPIE, 4841, 1670
Le Fevre, O., Vettolani, G., Garilli, B., et al. 2005, A&A, 439, 845
Lilly, S. J., Le Brun, V., Maier, C., et al. (the ZCOSMOS collaboration) 2009, ApJS, 184, 218
Marchetti, A., Granett, B. R., Guzzo, L., et al. (the VIPERS team) 2012, MNRAS, in press, arXiv:1207.4374
Mellier, Y., Bertin, E., Hudelot, P., et al. 2008, The CFHTLS T0005 Release, http://terapix.iap.fr/cplt/oldSite/Descart/CFHTLS-T0005-Release.pdf
Mustapha, H. & Dimitrakopoulos, R. 2010, C&M, 60, 2178
Newman, J. A., Cooper, M. C., Davis, M., et al. (the DEEP2 collaboration) 2012, arXiv:1203.3192
Oke, J. B. & Gunn, J. E. 1983, ApJ, 266, 713
Scodeggio, M., Franzetti, P., Garilli, B., et al. 2009, Msngr, 135, 13
Szapudi, I. & Pan, J. 2004, ApJ, 602, 26


Appendix A: Non-linear system 

The problem with this system of equations is that it is non-linear
and therefore difficult to solve; however, it can be reduced to a
one-dimensional equation which can be solved numerically.

The first two equations (n = 1 and n = 2) can be used to
express the first two cumulants with respect to the third and fourth
order ones


o% = ln(A 2 ) + In 


'B 2 ^ 

& 


B<S> 


ln(A 2 ) + In 


‘E ^ 

T 2 


(A.l) 
(A.2) 


where $B_1$ and $B_2$ are both functions of $x$ and $y$. Then, using
other combinations of the equations, one can express a system of two
equations for $x$ and $y$ alone


B\ = aiB\B 4 (A.3) 

B 2 B\ = a 2 B\, (A.4) 

A 2 

where a\ = -^ and a 2 = In order to solve properly the sys¬ 
tem we prefer to express it in term of one parameter // = B 2 /B i, 
moreover one can see that polynomials B\ to B\ are not indepen¬ 
dent, as a result 


B 4 — d + ciB\ + bB 2 + cB 3, 

where a — 96,b = -32 ,c = ^,d — --^ and which can be 
substituted in Eq. (I A. 3b . Combining Eq. (1A.3I) and Eq. ( IA.4b one 
obtains a parametric equation for B\ 

(a + bjf)B\ +(d + cf(jj))B\ - g{rj) = 0, (A.5) 
which can be solved for each value of the parameter $\eta$, and an
independent parametric equation for $B_3$,

$$ B_3 = f(\eta). $$

As a result we can find a couple ($B_1$, $B_3$) for each value of the
parameter $\eta$; it follows that one can express $x$ and $y$ with
respect to $\eta$ and, given the definition of $\eta$, the possible
solutions $x$ and $y$ must satisfy the condition

$$ B_2[x(\eta), y(\eta)] - \eta\, B_1[x(\eta), y(\eta)] = 0, $$

which gives the possible values of $\eta$ from which one can recover
$x$ and $y$. Finally, from Eq. (A.1) and Eq. (A.2) we can compute
the values of $\sigma_\Phi^2$ and $\mu_\Phi$ corresponding to each couple $(x, y)$ of
solutions. This allows us to select the solution which provides a 
value of Ag closer to the observed one. 

Once the values of the cumulants $\mu_\Phi$, $\sigma_\Phi^2$,
$\langle\Phi^3\rangle_c$ and $\langle\Phi^4\rangle_c$
are known from the process detailed above, we know that the
moments of the corresponding $P_N$ will match those of the
observed one up to order 4. Finally, one can check whether the
SLN distribution provides a good match to the data by numerically
integrating the probability density function convolved with the
Poisson kernel K (see Eq.0).


Appendix B: Generating function 

We show that the CPDF associated with a Gamma expanded PDF
can be calculated analytically from an expression which depends
explicitly on the coefficients $c_i$ of the Gamma expansion.

Let $\mathcal{G}_N$ be the generating function associated with the
probability distribution $P_N$; it is defined as

$$ \mathcal{G}_N(\lambda) \equiv \sum_{N=0}^{\infty} \lambda^N P_N. \tag{B.1} $$

In the case of the Poisson sampling of a Gamma distribution, after
some algebra, one can show that it can be expressed with respect
to the coefficients of the Gamma expansion as

$$ \mathcal{G}_N(\lambda) = \frac{1}{\Gamma(k)} \sum_{i=0} c_i F_i(\gamma), \tag{B.2} $$

where $\gamma = (1 - \lambda)\theta$ and

$$ F_i(\gamma) = \int_0^{\infty} x^{k-1} e^{-x} L_i^{(k-1)}(x)\, e^{-\gamma x}\, dx. $$

This integral can be computed using the Laguerre
expansion of the exponential

$$ e^{-\gamma x} = \sum_{i=0}^{\infty} \frac{\gamma^i}{(1+\gamma)^{i+\alpha+1}}\, L_i^{(\alpha)}(x), $$

which leads to

$$ F_i(\gamma) = \frac{\gamma^i}{(1+\gamma)^{i+k}}\, \frac{\Gamma(i+k)}{i!}. \tag{B.3} $$

The formal expression of the generating function is therefore
given by

$$ \mathcal{G}_N(\lambda) = \frac{(1+\gamma)^{-k}}{\Gamma(k)} \sum_{i=0} c_i\, \frac{\Gamma(i+k)}{i!} \left(\frac{\gamma}{1+\gamma}\right)^{i}, \tag{B.4} $$

where we still use $\gamma = (1 - \lambda)\theta$. From the explicit
expression of the generating function (Eq. B.4) one can get the
probability distribution $P_N$ by iteratively deriving the generating
function with respect to $\gamma$,

$$ P_N \equiv \frac{1}{N!} \left. \frac{d^N \mathcal{G}_N(\lambda)}{d\lambda^N} \right|_{\lambda=0} = \frac{(-\theta)^N}{N!} \left. \frac{d^N \mathcal{G}_N(\gamma)}{d\gamma^N} \right|_{\gamma=\theta}. $$

These derivatives can be calculated explicitly.
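As a numerical cross-check of the generating-function route, the sketch below evaluates Eq. (B.4) on the unit circle and reads off the $P_N$ as Taylor coefficients with a discrete Fourier sum, instead of taking the derivatives analytically. This swaps in a different technique from the explicit derivatives mentioned above. The function names are hypothetical, and it is assumed that `coeffs[i]` holds $c_i$ with $c_0 = 1$; with only $c_0$ non-zero the result should reduce to the Negative Binomial.

```python
import cmath
import math

def generating_function(lam, k, theta, coeffs):
    """Generating function of Eq. (B.4); coeffs[i] is assumed to hold c_i."""
    gamma = (1.0 - lam) * theta
    total = 0.0
    for i, c in enumerate(coeffs):
        if c == 0.0:
            continue
        # Gamma(i + k) / (Gamma(k) i!) computed through logarithms
        weight = math.exp(math.lgamma(i + k) - math.lgamma(k) - math.lgamma(i + 1))
        total += c * weight * (gamma / (1.0 + gamma)) ** i
    return (1.0 + gamma) ** (-k) * total

def pn_from_generating_function(k, theta, coeffs, n_max, n_points=256):
    """P_N as Taylor coefficients of G(lambda), from samples on the unit circle."""
    values = [generating_function(cmath.exp(2j * math.pi * m / n_points),
                                  k, theta, coeffs)
              for m in range(n_points)]
    return [(sum(v * cmath.exp(-2j * math.pi * m * n / n_points)
                 for m, v in enumerate(values)) / n_points).real
            for n in range(n_max + 1)]

# k and theta taken from one row of Table 3; pure Gamma case (c_0 = 1 only).
k_val, theta_val = 0.64267892, 1.4068744
pn = pn_from_generating_function(k_val, theta_val, [1.0], 60)
print(pn[:5])
```

The Fourier sum is accurate as long as $P_N$ is negligible for N beyond `n_points`, which holds here because the coefficients decay geometrically. Under the additional assumption that $c_1 = c_2 = 0$, the tabulated coefficients could be passed as `[1.0, 0.0, 0.0, c3, c4, c5, c6]`.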


Appendix C: Synthetic galaxy catalogues 

In this Appendix we describe how we generate synthetic galaxy 
catalogues from Gaussian realizations. The first requirement of 
these catalogues is that they must be characterized by a known 
power spectrum and 1-point probability distribution function. 
The second requirement is that the probability distribution func¬ 
tion must be measurable. 

The basic idea is simple: we generate a Gaussian random
field in Fourier space (assuming a power spectrum) and we inverse
Fourier transform it to get its analog in configuration space. We
further apply a local transform in order to map the Gaussian field
into a stochastic field characterized by the target PDF. The two
crucial steps of this process are the choice of the input power
spectrum and the choice of the local transform.

Let $\nu$ be a stochastic field following a centered
($\langle\nu\rangle = 0$), reduced
($\sigma_\nu^2 = \langle\nu^2\rangle_c = 1$) Gaussian distribution.
From a realization of this field, one can generate a non-Gaussian
density field $\delta$ by applying a local mapping $L$ between the
two, hence

$$ \delta = L(\nu). \tag{C.1} $$

The local transform $L$ must be chosen in order to match some
target PDF $P_\delta$ for the density contrast $\delta$. Assuming that
the local transform is a monotonic function which maps the interval
$]-\infty, +\infty[$ into $]-1, +\infty[$ then, due to the probability
conservation $P_\delta(\delta)\,d\delta = P_\nu(\nu)\,d\nu$, the local
transform must verify the following matching

$$ C_\delta[\delta] = C_\nu[\nu], \tag{C.2} $$

where $C_x$ stands for the cumulative probability distribution
function. Let $[a, b]$ be the domain of definition of the variable
$x$; then its cumulative probability distribution function is defined
as $C_x[x] \equiv \int_a^x P_x(x')\,dx'$, where $P_x$ is the PDF of
$x$. By definition a probability density function is positive; it
follows that its cumulative is a monotonic function and therefore
Eq. (C.2) can always be inverted. It reads

$$ \delta = C_\delta^{-1}[C_\nu(\nu)], $$

where the exponent $-1$ stands for the reciprocal function such
that $F^{-1}[F(x)] = x$. For example, by definition the local
mapping $L$ which transforms a Normal distribution into a
Log-Normal distribution is $\delta = e^{\nu} - 1$. Note that depending
on the PDF that must be matched this inversion can require a
numerical evaluation which can be tabulated.
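The cumulative matching of Eq. (C.2) can be sketched numerically as follows. The target used here is a hypothetical Log-Normal case in which ln(1 + δ) follows a standard Normal distribution, so the exact answer δ = exp(ν) - 1 is known and can be compared with the numerical inversion. The function names and the bisection bounds are assumptions made for the example.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # centered, reduced Gaussian for the field nu

def target_cdf(delta):
    """CDF of a hypothetical target: ln(1 + delta) is a standard Normal variable."""
    return _STD_NORMAL.cdf(math.log1p(delta))

def local_transform(nu, cdf=target_cdf, lo=-1.0 + 1e-12, hi=1.0e6):
    """Solve C_delta(delta) = C_nu(nu) for delta by bisection."""
    p = _STD_NORMAL.cdf(nu)
    a, b = lo, hi
    for _ in range(200):
        mid = 0.5 * (a + b)
        if cdf(mid) < p:
            a = mid
        else:
            b = mid
        if b - a < 1e-13 * max(1.0, abs(mid)):
            break
    return 0.5 * (a + b)

for nu in (-2.0, 0.0, 1.0):
    print(nu, local_transform(nu), math.exp(nu) - 1.0)
```

For a target PDF known only numerically, `target_cdf` would be replaced by an interpolation of the tabulated cumulative distribution, and the mapping itself could be tabulated once on a grid of ν values.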

Once a local transform is chosen, we need to adress the 
question of finding the appropriate power spectrum of the 
Gaussian field v which, once locally mapped into the density 
field d, will match the expected power spectrum. Following 
[Greiner & EnBlin I (1 201 5l) . who considered a log-transform we 
generalized their result to a generic local transformation. This 
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mapping is not direct in Fourier space while it is in configura¬ 
tion. Writing the two point moment of order two of the density 
field 5 and assuming the probability conservation leads to 

6 = = J"l(vi)L(v 2 )S(vi, v 2 ,£ v )dvidv 2 , (C.3) 

where 8 is a bivariate Gaussian defined as 

S(v " 55 {-^'j ■ (C4) 


One can notice that in our case (centered reduced Gaussian) the covariance matrix $C_\nu$ takes the simple form

$$ C_\nu = \begin{pmatrix} 1 & \xi_\nu \\ \xi_\nu & 1 \end{pmatrix}. $$

Once integrated over the definition domain of $\nu_1$ and $\nu_2$, Eq. (C.3) provides a mapping between the 2-point correlation function of the Gaussian field $\nu$ and the 2-point correlation function of the density field $\delta$. However, we prefer to rotate the coordinate system before performing the integral (C.3), because in the case of high correlation ($\xi_\nu \sim 1$) the bivariate Gaussian collapses towards a straight line and most of the sampling of this function would be useless. We therefore look for the rotation that diagonalizes the matrix $C_\nu$ and converts $\nu$ into a new variable $x$. It follows that

$$ C_x = \begin{pmatrix} 1-\xi_\nu & 0 \\ 0 & 1+\xi_\nu \end{pmatrix} $$

and the integral becomes

$$ \xi_\delta = \iint L\!\left(\frac{x_2+x_1}{\sqrt{2}}\right) L\!\left(\frac{x_2-x_1}{\sqrt{2}}\right) \frac{1}{2\pi\sigma_1\sigma_2} \exp\!\left(-\frac{x_1^2}{2\sigma_1^2}-\frac{x_2^2}{2\sigma_2^2}\right) \mathrm{d}x_1\,\mathrm{d}x_2, \tag{C.5} $$

where $\sigma_1^2 = 1-\xi_\nu$ and $\sigma_2^2 = 1+\xi_\nu$. We can therefore integrate over a bounded domain, from $-8\sigma_1$ to $8\sigma_1$ along the $x_1$ axis and from $-8\sigma_2$ to $8\sigma_2$ along the $x_2$ axis. Another possibility to perform the integral (C.4) is to use Mehler's formula; doing so, one can show that the 2-point correlation of the density field can be expressed as a Taylor expansion in the 2-point correlation function of the $\nu$ field. It reads

$$ \xi_\delta = \sum_{n=0}^{\infty} n!\, c_n^2\, \xi_\nu^n, \tag{C.6} $$

where the $c_n$ are the coefficients of the Hermite transform of the local mapping, $L(\nu) = \sum_{n=0}^{\infty} c_n H_n(\nu)$, and they can be calculated using the orthogonality properties of Hermite polynomials:

$$ c_n = \frac{1}{n!} \int_{-\infty}^{+\infty} L(\nu)\, H_n(\nu)\, P_\nu(\nu)\, \mathrm{d}\nu. \tag{C.7} $$

The latter approach considerably speeds up the numerical evaluation of Eq. (C.5): it allows the 2-D integral to be computed as a finite sum of 1-D integrals. It also allows us to verify that when the 2-point function of the field $\nu$ is positive, the derivative of $\xi_\delta$ with respect to $\xi_\nu$ is positive. Moreover, from Eq. (C.3), one can see that $\xi_\nu = 0$ implies $\xi_\delta = 0$. This means that the function which transforms $\xi_\nu$ into $\xi_\delta$ is invertible as long as $\xi_\nu$ is positive. On the other hand, we know that the zero-crossing of the 2-point correlation function occurs at very large scales, at which one can safely assume that $|\xi_\nu| \ll 1$; thus, by continuity, one can truncate Eq. (C.6) at order one, which provides a linear relation between $\xi_\delta$ and $\xi_\nu$. As a result, one can take the reciprocal of the function $\lambda$ such that $\xi_\nu = \lambda^{-1}(\xi_\delta)$.

Once the local transform $L$ and the 2-point correlation mapping $\lambda$ are known, the input power spectrum of the Gaussian field $\nu$ can be obtained as follows. We choose a power spectrum $P(k)$ for the density field $\delta$, in the present case that of Eisenstein & Hu (1998), and we calculate its corresponding 2-point correlation function

$$ \xi_\delta(r) = \int P(k)\, e^{i\mathbf{k}\cdot\mathbf{r}}\, \mathrm{d}^3k. \tag{C.8} $$

At each scale $r$, one can deduce the 2-point correlation function of the Gaussian field, $\xi_\nu = \lambda^{-1}(\xi_\delta)$, and finally, using a Fourier transform, we obtain the input power spectrum

$$ P_{in}(k) = \frac{1}{(2\pi)^3} \int \xi_\nu(r)\, e^{-i\mathbf{k}\cdot\mathbf{r}}\, \mathrm{d}^3r. \tag{C.9} $$

Finally, in order to make sure that the target PDF will be reproduced, it is necessary to verify that, once the input power spectrum $P_{in}(k)$ has been set up on the regular $k$-space grid which will be used to generate the Gaussian field, its integral is indeed equal to the expected variance on the scale of the mesh: $\sigma_a^2 = (2\pi/L_{\rm box})^3 \sum_n P_{in}(k_n)$, with $L_{\rm box}$ the side of the box, should be equal to $\hat{\sigma}_a^2 = \int P_{in}(k)\, \mathrm{d}^3k$. In general, $\sigma_a$ and $\hat{\sigma}_a$ are not equal; we therefore renormalize the target power spectrum by the quantity $S = \hat{\sigma}_a^2/\sigma_a^2$, that is, $P_{in}(k) \rightarrow S\, P_{in}(k)$.
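As an illustration of Eqs. (C.6) and (C.7) and of the inversion of the mapping $\lambda$, the following minimal Python sketch computes the Hermite coefficients of a local transform by Gauss-Hermite quadrature, sums the series, and inverts it by interpolation on the range $0 \le \xi_\nu \le 1$. It is not the code used for this work: the function names, the truncation order and the quadrature size are assumptions, and the log-normal transform is only a toy choice, picked because its mapping $\xi_\delta = \exp(s^2 \xi_\nu) - 1$ is known in closed form and can serve as a check.

```python
import math
import numpy as np
from numpy.polynomial import hermite_e as He  # probabilists' Hermite polynomials

def hermite_coefficients(L, n_max, n_quad=100):
    """c_n = (1/n!) * integral of L(nu) He_n(nu) P(nu), with P the unit Gaussian PDF."""
    x, w = He.hermegauss(n_quad)       # nodes and weights for the weight exp(-x^2/2)
    w = w / np.sqrt(2.0 * np.pi)       # rescale so that the weights sum to one
    Lx = L(x)
    c = np.empty(n_max + 1)
    for n in range(n_max + 1):
        basis = np.zeros(n + 1)
        basis[n] = 1.0                 # selects He_n in hermeval
        c[n] = np.sum(w * Lx * He.hermeval(x, basis)) / math.factorial(n)
    return c

def xi_delta_from_xi_nu(xi_nu, c):
    """Truncated series of Eq. (C.6): sum over n of n! c_n^2 xi_nu^n."""
    n = np.arange(len(c))
    weights = np.array([math.factorial(int(k)) for k in n], dtype=float) * c**2
    powers = np.power.outer(np.asarray(xi_nu, dtype=float), n)
    return np.sum(weights * powers, axis=-1)

def xi_nu_from_xi_delta(xi_delta, c, n_grid=2001):
    """Invert the mapping by tabulating it for 0 <= xi_nu <= 1, where it is monotonic."""
    grid = np.linspace(0.0, 1.0, n_grid)
    table = xi_delta_from_xi_nu(grid, c)
    return np.interp(xi_delta, table, grid)

# Toy local transform (an assumption, not the transform adopted in the paper).
s = 0.5
def lognormal_transform(nu):
    return np.exp(s * nu - 0.5 * s**2) - 1.0

c = hermite_coefficients(lognormal_transform, n_max=20)
xi_nu = 0.3
xi_delta = xi_delta_from_xi_nu(xi_nu, c)
xi_nu_back = xi_nu_from_xi_delta(xi_delta, c)
```

For this toy transform the coefficients are $c_0 = 0$ and $c_n = s^n/n!$, so the series can be compared directly with the analytic mapping. The interpolation-based inverse is only meant for positive correlations; near the zero-crossing the order-one truncation discussed in the text would be used instead.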


We generate a Gaussian field (with a flat power spectrum) on a regular mesh of size $a = 0.95\,h^{-1}$Mpc in a comoving box of $500^3\,h^{-3}$Mpc$^3$. We then Fourier transform it with an FFT and keep only the phases of the field, $\nu_k = e^{i\theta(k)}$. We generate at each position $k_n$ the value of the modulus of $\nu_k = \sqrt{X_k}\, e^{i\theta(k)}$, where $X_k = -P_{in}(k) \ln(1-\epsilon)$ and $\epsilon$ is a random number with a uniform probability distribution between 0 and 1. We then inverse Fourier transform the field to get a centered reduced Gaussian field. In Fig. C.1 we show the input power spectrum of the Gaussian field $\nu$ compared to the one measured using an FFT, and to the one expected from the local transformation applied to the $\nu$ field in order to generate the density field $\delta$.
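The generation steps described above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the function name, the toy input power spectrum, and the small mesh and box are hypothetical, and the final empirical rescaling to zero mean and unit variance stands in for the renormalization of $P_{in}$ and for the FFT amplitude conventions, which are not tracked here.

```python
import numpy as np

def gaussian_field(p_in, n=32, box=100.0, seed=0):
    """Gaussian field on an n^3 mesh whose power spectrum follows p_in(k) in shape."""
    rng = np.random.default_rng(seed)

    # Step 1: white Gaussian noise (flat spectrum); its FFT supplies the phases.
    white = rng.standard_normal((n, n, n))
    phases = np.fft.rfftn(white)
    modulus = np.abs(phases)
    modulus[modulus == 0.0] = 1.0      # guard against division by zero
    phases /= modulus                  # unit modulus: only exp(i theta) is kept

    # Wavenumber modulus on the half-complex grid used by rfftn.
    k_full = 2.0 * np.pi * np.fft.fftfreq(n, d=box / n)
    k_half = 2.0 * np.pi * np.fft.rfftfreq(n, d=box / n)
    k = np.sqrt(k_full[:, None, None]**2
                + k_full[None, :, None]**2
                + k_half[None, None, :]**2)

    # Step 2: squared modulus X_k = -P_in(k) ln(1 - eps), eps uniform in [0, 1).
    # The draws are independent per stored mode; irfftn keeps the output real.
    eps = rng.uniform(size=k.shape)
    x_k = -p_in(k) * np.log(1.0 - eps)
    nu_k = np.sqrt(x_k) * phases
    nu_k[0, 0, 0] = 0.0                # remove the mean mode

    # Step 3: back to configuration space, then center and reduce the field.
    nu = np.fft.irfftn(nu_k, s=(n, n, n))
    nu -= nu.mean()
    nu /= nu.std()                     # shortcut assumed here, see the lead-in
    return nu

# Hypothetical toy spectrum, used only to exercise the function.
def toy_power(k):
    return 1.0 / (1.0 + (k / 0.2)**2)

nu = gaussian_field(toy_power)
```

Drawing $X_k$ from an exponential distribution with mean $P_{in}(k)$ is equivalent to drawing a Rayleigh-distributed modulus, which is what a Gaussian field requires for each Fourier mode.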

Fig. C.1. Upper: Grey dotted lines show the power spectrum measured in each of the 20 fake galaxy distributions, the black solid line represents their average, and the error bars display the dispersion of the measurements. The blue long-dashed line displays the input power spectrum used to generate the Gaussian stochastic field $\nu$, and the red dashed line shows the corresponding expectation value for the power spectrum of the density contrast $\delta$. Lower: Deviation between the measured power spectrum of the $\delta$-field and the expected one.